PRINCIPAL COMPONENTS IN THE NONNORMAL CASE:

THE TEST FOR SPHERICITY

Christine M. Waternaux
Harvard University

Summary.

The limiting distribution of the likelihood ratio statistic $W_q$ for
testing the hypothesis of equality of q characteristic roots of a
covariance matrix for normal populations is studied for nonnormal
populations. It is shown, both theoretically and empirically, that the
limiting distribution of $W_q$ is not robust to departures from normality
characterized by nonzero fourth cumulants and that $W_q$ cannot be used
for these nonnormal populations. For the class of spherically symmetric
populations, it is shown that the limiting distribution of $W_q$ is
proportional to a chi-square under the null hypothesis of equality of q
population roots and to a noncentral chi-square under an appropriate
sequence of alternative hypotheses. A corrected test statistic, $W_q^*$,
whose limiting distribution is chi-square, is proposed to test the
hypothesis of sphericity for the general class of spherically symmetric
populations. Results of a Monte Carlo experiment conducted to compare
the performances of $W_q$ and $W_q^*$ are presented for various
contaminated normal models at various sample sizes. It is demonstrated
that there is little difference between $W_q$ and $W_q^*$ for normal
populations, but that dramatic improvements are gained by using $W_q^*$
for the nonnormal populations.

KEY WORDS: Principal Components. Robustness. Test for Equality of Variances.


1. Introduction. Principal Components Analysis is a multivariate
data analysis technique which aims at reducing the dimensionality of a
p-variate data set without losing too much information. The test for
sphericity is the critical test performed in order to separate the
$(p-q)$ principal components with large variances, which contribute
significantly to the total variation in the sample, from the q components
with much smaller and equal variances, which explain a negligible
fraction of the total variation.

Let x be a p x 1 random vector. Let $\lambda_1 \le \lambda_2 \le \dots \le \lambda_p$
be the characteristic roots of the population covariance matrix $\Sigma$,
and let $l_1 \le l_2 \le \dots \le l_p$ denote the roots of the sample
covariance matrix $S = (s_{ij})$ for a sample of size N drawn from the
distribution of x. The vector y of principal components is obtained by
rotation of the original vector x: $y = \Gamma' x$, where $\Gamma$ denotes
the orthogonal matrix of (ordered) characteristic vectors of $\Sigma$. Then

$$\operatorname{var} y_i = \lambda_i \quad (i = 1, \dots, p),$$

and the hypothesis of equality of q roots is:

$$H_0:\ \lambda_1 = \lambda_2 = \dots = \lambda_q = \lambda < \lambda_{q+1} < \dots < \lambda_p .$$

If x has a multivariate normal distribution, the likelihood ratio
criterion for testing $H_0$ is

$$W_q = -N \Big\{ \sum_{i=1}^{q} \log l_i - q \log\Big( \frac{1}{q} \sum_{i=1}^{q} l_i \Big) \Big\},$$

and the limiting distribution for large N of $W_q$ is chi-square on
$(q+2)(q-1)/2$ degrees of freedom (Anderson, 1963).
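The criterion above depends on the data only through the q smallest sample roots, so it can be computed directly from an eigendecomposition of S. The following is a minimal sketch, not the author's program; the function names `sphericity_lr_statistic` and `sphericity_df` are hypothetical, and NumPy is assumed. Because $W_q$ is unchanged when S is multiplied by a constant, the choice of divisor (N or N - 1) in the sample covariance does not affect the result.

```python
import numpy as np

def sphericity_lr_statistic(X, q):
    """W_q = -N * { sum_{i<=q} log l_i - q * log(mean of the q smallest roots) }."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    # Sample covariance matrix S; rows of X are observations.
    S = np.cov(X, rowvar=False)
    # eigvalsh returns the roots in ascending order: the first q are the smallest.
    roots = np.linalg.eigvalsh(S)[:q]
    return -N * (np.sum(np.log(roots)) - q * np.log(np.mean(roots)))

def sphericity_df(q):
    """Degrees of freedom (q+2)(q-1)/2 of the limiting chi-square under normality."""
    return (q + 2) * (q - 1) // 2
```

By the arithmetic-geometric mean inequality the statistic is nonnegative, and it is zero exactly when the q smallest sample roots coincide.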


The purpose of this paper is to show that the limiting distribution
of $W_q$ is nonrobust to departures from normality in the population, both
under the null hypothesis and under the alternative, and to propose a
modified test statistic $W_q^*$ that can be used to test $H_0$ for the more
general class of affine transformations of spherically symmetric
populations (which include the normal populations).

The nonrobustness of the tests for the hypothesis of homogeneity of
variances for p independent samples has been pointed out by Box (1953)
and alternative methods have been proposed by several authors: Levene
(1960), Miller (1968), Layard (1973) and Brown and Forsythe (1974). The
testing problem considered here for the purpose of Principal Components
Analysis is a generalization in the following sense: the p variables
$x_i$ $(i = 1, \dots, p)$ are not necessarily independent and one tests the
homogeneity of variances of the uncorrelated variables $y_i$
$(i = 1, \dots, q;\ q \le p)$ obtained by an appropriate rotation of the
$x_i$'s. The rotation-invariant test statistic $W_q$ is a function only of
the sample roots $l_i$ $(i = 1, \dots, q)$. The limiting distribution of
$W_q$ as $N \to +\infty$ depends on the kurtosis of the marginal
distribution of $y_i$ and also on the bivariate fourth order cumulants,
$\operatorname{cor}(y_i^2, y_j^2)$, as does the limiting distribution of the
sample roots (Waternaux, 1976). In the general case, adjusting for nonzero
fourth cumulants by estimating them is difficult and the limiting
distributions of the corrected test statistics are not simple. However,
for the class of spherically symmetric distributions (the class for which
Principal Components Analysis is most appropriate and meaningful) a robust
statistic $W_q^*$, whose limiting distribution as $N \to +\infty$ is also
chi-square, is proposed to test for sphericity. For this class of
distributions, the significance level and power of the tests are
(asymptotically) valid and the standard results of Principal Components
Analysis can be used (after adjustment for kurtosis).

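2. Asymptotic Theory.

Assumptions and Notation. Let x be a (p x 1) vector from a
multivariate distribution with E(x) = 0 and for which all fourth
order moments and crossmoments exist, and denote the standardized
cumulants by $\kappa$, indexed by the components involved
$(i, j, \dots = 1, \dots, p)$; for example

$$\kappa_4^{i} = \frac{E[x_i^4]}{(\operatorname{var} x_i)^2} - 3 \quad (i = 1, \dots, p),$$

$$\kappa_{22}^{ij} = \frac{E[x_i^2 x_j^2]}{\operatorname{var} x_i \, \operatorname{var} x_j} - 1 \quad (i \ne j).$$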
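As an illustration of the two definitions above, sample analogues of $\kappa_4^i$ and $\kappa_{22}^{ij}$ can be obtained by replacing expectations with sample averages. This is a sketch under the stated definitions only; the function name `standardized_fourth_cumulants` is hypothetical, and the data are centered at the sample mean rather than assumed to have mean zero.

```python
import numpy as np

def standardized_fourth_cumulants(X):
    """Sample versions of kappa_4^i and kappa_22^{ij} (rows of X are observations)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    var = np.mean(Xc ** 2, axis=0)               # var x_i
    kappa4 = np.mean(Xc ** 4, axis=0) / var ** 2 - 3.0
    m22 = (Xc ** 2).T @ (Xc ** 2) / X.shape[0]   # sample E[x_i^2 x_j^2]
    kappa22 = m22 / np.outer(var, var) - 1.0
    np.fill_diagonal(kappa22, np.nan)            # kappa_22 is defined for i != j only
    return kappa4, kappa22
```

For two independent symmetric two-point variables (each equal to +1 or -1), the function returns $\kappa_4 = -2$ for both and $\kappa_{22} = 0$, the extreme short-tailed case.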


For the asymptotic theory assume (without loss of generality) that

$$\Sigma = \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p),$$

and that the (p-q) largest roots are distinct

$$(1) \qquad \lambda_{q+1} < \lambda_{q+2} < \dots < \lambda_p .$$

The following standardized variables are constructed from the elements
of the sample covariance matrix S

$$(2) \qquad z_{ij} = \sqrt{N}\, \frac{s_{ij} - \lambda_i \delta_{ij}}{\sqrt{\lambda_i \lambda_j}} \quad (i, j = 1, \dots, p),$$

where $\delta_{ij}$ is the Kronecker delta; let $Z = (z_{ij})$.

The probability law of a random variable X will be indicated by
$\mathcal{L}(X)$. The Mann-Wald symbols $O_p$ and $o_p$ are interpreted in
the usual sense. Two lemmas will be useful.

Lemma 1 (Slutsky Theorem). Let $\{X_n\}$ and $\{Y_n\}$ be two sequences
of random variables. If $X_n = Y_n + o_p(1)$ and
$\mathcal{L}(Y_n) \to \mathcal{L}(Y)$ as $n \to +\infty$, then
$\mathcal{L}(X_n) \to \mathcal{L}(Y)$ as $n \to +\infty$.

As a consequence, if $X_n = Y_n [1 + o_p(1)]$ and
$\mathcal{L}(Y_n) \to \mathcal{L}(Y)$, then
$\mathcal{L}(X_n) \to \mathcal{L}(Y)$ as $n \to +\infty$, since
$Y_n [1 + o_p(1)] = Y_n + o_p(1)$ when $Y_n = O_p(1)$.

Lemma 2 (see, for instance, Rao [1965]). Let x be multivariate
normal with mean $\mu$ and nonsingular covariance matrix V. A necessary
and sufficient condition for $x'Ax$ to be noncentral chi-square with
r degrees of freedom and noncentrality parameter $\tfrac{1}{2}\mu' A \mu$
is that AV be idempotent and r = rank(AV).

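The asymptotic distributions of $W_q$ under the null and alternative
hypotheses are now found by first deriving an asymptotic expansion in
terms of the standardized variables $z_{ij}$.

2.1. Asymptotic Expansion. The test criterion

$$W_q = -N \Big\{ \sum_{i=1}^{q} \log l_i - q \log\Big( \frac{1}{q} \sum_{i=1}^{q} l_i \Big) \Big\}$$

is rewritten as

$$(3) \qquad W_q = N \Big\{ q \log\Big[ \frac{1}{q}\Big( \operatorname{tr} S - \sum_{i=q+1}^{p} l_i \Big) \Big] - \log|S| + \sum_{i=q+1}^{p} \log l_i \Big\}.$$

The following expansions hold for the determinant and trace of S:

$$\log|S| = \log|\Lambda| + \frac{1}{\sqrt{N}} \sum_{i=1}^{p} z_{ii} - \frac{1}{2N} \sum_{i,j=1}^{p} z_{ij}^2 + O_p(N^{-3/2}),$$

$$\operatorname{tr} S = \operatorname{tr} \Lambda + \frac{1}{\sqrt{N}} \sum_{i=1}^{p} \lambda_i z_{ii} .$$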


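It is known that when the population root $\lambda_r$ is simple the
r-th sample root can be expanded as:

$$(4) \qquad l_r = \lambda_r \Big\{ 1 + \frac{z_{rr}}{\sqrt{N}} + \frac{1}{N} \sum_{i \ne r} \frac{\lambda_i z_{ri}^2}{\lambda_r - \lambda_i} \Big\} + O_p(N^{-3/2})$$

(see Lawley, 1956); this holds for r = q+1, ..., p under the assumption (1).

Substituting in (3) the expansion (4) for $l_r$ and the expansions
for $\log|S|$ and $\operatorname{tr} S$, some straightforward algebra yields
the asymptotic expansion for $W_q$

$$(5) \qquad W_q = -N \sum_{k=1}^{q} \log \rho_k + \sqrt{N} \sum_{k=1}^{q} (\rho_k - 1) z_{kk} + \sum_{k=1}^{q} \sum_{i=q+1}^{p} \frac{\lambda_i (1 - \rho_k)}{\lambda_i - \lambda_k}\, z_{ik}^2 + \sum_{i<j \le q} z_{ij}^2 + \frac{1}{2} \sum_{k=1}^{q} z_{kk}^2 - \frac{1}{2q} \Big( \sum_{k=1}^{q} \rho_k z_{kk} \Big)^2 + O_p(N^{-1/2}),$$

where

$$\rho_k = \frac{\lambda_k}{\bar\lambda}, \qquad \bar\lambda = \frac{1}{q} \sum_{k=1}^{q} \lambda_k \quad (k = 1, \dots, q).$$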


The expansion (5) is valid whether or not the null hypothesis
is satisfied provided that $\lambda_{q+1} < \lambda_{q+2} < \dots < \lambda_p$;
this expansion does not seem to have been published in the literature.
Under the null hypothesis, $\lambda_1 = \dots = \lambda_q = \lambda$, the
expansion of $W_q$ reduces to

$$(6) \qquad W_q = \sum_{\substack{i,j=1 \\ i<j}}^{q} z_{ij}^2 + \frac{1}{2} \sum_{i=1}^{q} (z_{ii} - \bar z)^2 + O_p(N^{-1/2}),$$

where

$$\bar z = \frac{1}{q} \sum_{i=1}^{q} z_{ii} .$$

The latter expansion has been given by Anderson (1963), using another method.
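The null expansion (6) can be checked numerically. The sketch below is only an illustration under simplifying assumptions that are not part of the paper's argument: q = p, $\Sigma = I$, normal data, a known mean of zero (so that S is computed about zero), and NumPy's random generator; the function name `exact_and_expansion` is hypothetical. For large N the exact statistic and the quadratic form should differ by a term of order $N^{-1/2}$.

```python
import numpy as np

def exact_and_expansion(X):
    """Return (W_q, leading term of expansion (6)) for q = p, Sigma = I, mean 0."""
    N, q = X.shape
    S = X.T @ X / N                        # covariance about the known mean 0
    roots = np.linalg.eigvalsh(S)
    W = -N * (np.sum(np.log(roots)) - q * np.log(np.mean(roots)))
    Z = np.sqrt(N) * (S - np.eye(q))       # z_ij of (2) with lambda_i = 1
    d = np.diag(Z)
    off = np.sum(np.triu(Z, k=1) ** 2)     # sum over i < j of z_ij^2
    approx = off + 0.5 * np.sum((d - d.mean()) ** 2)
    return W, approx

rng = np.random.default_rng(0)
W, approx = exact_and_expansion(rng.standard_normal((200_000, 3)))
```

The off-diagonal sum and the centered diagonal sum are the two components of departure from sphericity (correlation and heterogeneity of variance) that the expansion separates.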


Note that the departures from sphericity are decomposed into two
parts in the expansion of $W_q$: q(q-1)/2 off-diagonal terms indicating
correlation and (q-1) diagonal terms indicating heterogeneity of
variance.

It should be noted that under the null hypothesis
$\lambda_1 = \dots = \lambda_q = \lambda$, the asymptotic expansion of $W_q$,
up to terms of order $N^{-1/2}$, only involves $z_{ij}$ where $i, j \le q$
and therefore the limiting distribution of $W_q$ only depends on the
distribution of $y_i$ $(i = 1, \dots, q)$.


2.2. Limiting Distribution of $W_q$ Under the Null Hypothesis.

Proposition 1. The limiting null distribution as $N \to +\infty$ of $W_q$
is that of a linear combination of chi-squares on one degree of freedom
whose coefficients depend on the (standardized) fourth cumulants of the
principal components of the parent population

$$\kappa_4^{i}\ (i = 1, \dots, q) \quad \text{and} \quad \kappa_{22}^{ij}\ (i, j = 1, \dots, q;\ i \ne j).$$

For normal populations (or for populations whose fourth cumulants
are zero), the limiting null distribution of $W_q$ is chi-square on
(q+2)(q-1)/2 degrees of freedom.

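Proof. By the multivariate central limit theorem the variables $z_{ij}$
are asymptotically jointly multivariate normal with covariance
matrix given by

$$(7) \qquad \begin{aligned}
\operatorname{var} z_{ii} &= \kappa_4^{i} + 2 + O(N^{-1}) && (i = 1, \dots, p), \\
\operatorname{var} z_{ij} &= \kappa_{22}^{ij} + 1 + O(N^{-1}) && (i, j = 1, \dots, p;\ i \ne j), \\
\operatorname{cov}(z_{ij}, z_{lm}) &= \kappa^{ijlm} + O(N^{-1}) && ((i,j) \ne (l,m)),
\end{aligned}$$

where $\kappa^{ijlm}$ denotes the standardized fourth order cumulant of
$(x_i, x_j, x_l, x_m)$.

For normal populations the $z_{ij}$ are asymptotically independent, and
the well known result

$$\lim_{N \to +\infty} \mathcal{L}(W_q) = \chi^2[(q+2)(q-1)/2]$$

follows immediately from (6) and Slutsky's theorem. For nonnormal
populations the $z_{ij}$ are in general dependent and

$$\lim_{N \to +\infty} \mathcal{L}(z_{ii}^2) = (\kappa_4^{i} + 2)\, \chi^2(1),$$

$$\lim_{N \to +\infty} \mathcal{L}(z_{ij}^2) = (\kappa_{22}^{ij} + 1)\, \chi^2(1),$$

and, in general, the limiting distribution of $W_q$ is complicated.

Finally, if the observations are a random sample of p independent
identically distributed variables with standardized kurtosis $\kappa_4$,
the limiting distribution of $W_q$ is that of a linear combination of
two independent chi-squares

$$\lim_{N \to +\infty} \mathcal{L}(W_q) = \chi^2[q(q-1)/2] + (1 + \kappa_4/2)\, \chi^2(q-1) .$$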
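To see how far the limiting law in the independent, identically distributed case departs from the nominal chi-square, its first two moments can be written down from the representation as a sum of two independent chi-squares. The sketch below is illustrative; the function name `iid_limit_moments` is hypothetical, and only the standard facts that a chi-square on d degrees of freedom has mean d and variance 2d are used.

```python
def iid_limit_moments(q, kappa4):
    """Mean and variance of chi2[q(q-1)/2] + (1 + kappa4/2) * chi2(q-1), terms independent."""
    df_off = q * (q - 1) / 2          # off-diagonal (correlation) part
    df_diag = q - 1                   # diagonal (heterogeneity of variance) part
    c = 1.0 + kappa4 / 2.0            # inflation factor of the diagonal part
    mean = df_off + c * df_diag
    var = 2.0 * df_off + c ** 2 * 2.0 * df_diag
    return mean, var
```

With q = 6 and zero kurtosis this gives mean 20 and variance 40, the moments of a chi-square on (q+2)(q-1)/2 = 20 degrees of freedom; with $\kappa_4 = 1.6$ the mean rises to 24, so referring $W_q$ to the nominal chi-square would reject too often.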


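2.3. Limiting Distribution Under Alternative Hypotheses.

Proposition 2. Under the alternative hypothesis $H_a$
($\rho_i = \lambda_i / \bar\lambda \ne 1$ for some $i = 1, \dots, q$) the
limiting distribution as $N \to +\infty$ of

$$\Big( W_q + N \sum_{j=1}^{q} \log \rho_j \Big) \Big/ \sqrt{N}$$

is normal with mean 0 and variance

$$\sum_{i=1}^{q} (\rho_i - 1)^2 (\kappa_4^{i} + 2) + 2 \sum_{\substack{i,j=1 \\ i<j}}^{q} (\rho_i - 1)(\rho_j - 1)\, \kappa_{22}^{ij} .$$

If the standardized variables $x_i / \sqrt{\lambda_i}$ $(i = 1, \dots, p)$
are independent and identically distributed with kurtosis $\kappa_4$, the
variance reduces to

$$(\kappa_4 + 2) \sum_{j=1}^{q} (\rho_j - 1)^2 .$$

Proof. The result follows from the asymptotic expansion (5) for $W_q$.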
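The variance in Proposition 2 is the variance of the term $\sum_k (\rho_k - 1) z_{kk}$ of the expansion (5), evaluated with the asymptotic covariances of the $z_{kk}$. A small sketch of that computation follows; the function name `prop2_variance` and its argument layout (a vector of roots, a vector of $\kappa_4^i$, and a symmetric matrix of $\kappa_{22}^{ij}$) are assumptions made for illustration.

```python
import numpy as np

def prop2_variance(lam, kappa4, kappa22):
    """Asymptotic variance of (W_q + N * sum_j log rho_j) / sqrt(N) under a fixed alternative."""
    lam = np.asarray(lam, dtype=float)        # the q smallest population roots
    k4 = np.asarray(kappa4, dtype=float)      # kappa_4^i, length q
    k22 = np.asarray(kappa22, dtype=float)    # kappa_22^{ij}, q x q (diagonal ignored)
    d = lam / lam.mean() - 1.0                # rho_i - 1
    var = np.sum(d ** 2 * (k4 + 2.0))
    q = len(lam)
    for i in range(q):
        for j in range(i + 1, q):             # sum over pairs i < j
            var += 2.0 * d[i] * d[j] * k22[i, j]
    return var
```

When all $\kappa_{22}^{ij}$ vanish and the $\kappa_4^i$ share a common value, the result reduces to $(\kappa_4 + 2) \sum_j (\rho_j - 1)^2$, as stated in the proposition.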


Proposition 3. (a) Under the sequence of alternatives

$$H_{aN}:\ \lambda_i = \lambda \Big( 1 + \frac{a_i}{\sqrt{N}} \Big) \quad (\forall i = 1, \dots, q),$$

where $a_i \in R$ and $\lambda > 0$, the limiting distribution for large N
of $W_q$ is that of a linear combination of central and noncentral
chi-squares on one degree of freedom. In the particular case of a random
sample of x, where the variables $x_i / \sqrt{\lambda_i}$ are independent
and identically distributed with kurtosis $\kappa_4$, the limiting
distribution of $W_q$ is that of a linear combination of a central
chi-square and an independent noncentral chi-square

$$\lim_{N \to +\infty} \mathcal{L}(W_q) = \chi^2[q(q-1)/2] + (1 + \kappa_4/2)\, \chi'^2_f(q-1),$$

where the noncentrality parameter is

$$f = \sum_{i=1}^{q} (a_i - \bar a)^2 / (4 + 2\kappa_4), \qquad \bar a = \frac{1}{q} \sum_{i=1}^{q} a_i .$$

(b) Under the sequence of alternatives

$$H'_{aN}:\ \lambda_i = \lambda \Big( 1 + \frac{a_i}{N} \Big) \quad (\forall i = 1, \dots, q),$$

the limiting distribution as $N \to +\infty$ of $W_q$ is the same as in the
null case.


Proof. The result follows from the fact that in the case (a), under
$H_{aN}$,

$$\rho_i = \frac{\lambda_i}{\bar\lambda} = 1 + \frac{a_i - \bar a}{\sqrt{N}} + O(N^{-1}) \quad (i = 1, \dots, q),$$

and after simplifications, the asymptotic expansion (5) is reduced to

$$W_q = \frac{1}{2} \sum_{i=1}^{q} (z_{ii} - \bar z + a_i - \bar a)^2 + \sum_{\substack{i,j=1 \\ i<j}}^{q} z_{ij}^2 + O_p(N^{-1/2}) .$$

Similarly for (b), under $H'_{aN}$,

$$\rho_i = \frac{\lambda_i}{\bar\lambda} = 1 + \frac{a_i - \bar a}{N} + O(N^{-2}) \quad (i = 1, \dots, q),$$

and the constant terms in the expansion (5) are of order $N^{-1}$. The
asymptotic expansion for $W_q$ is then given by (6), as in the null
case, and the limiting distribution of $W_q$ is the same as under the
null hypothesis.


3. Distribution Theory for Spherically Symmetric Models. Consider
the class of p-variate distributions which are spherically symmetric
after an appropriate affine transformation. More precisely, define
$\mathcal{F}$ by:

(8) the distribution of x, $F(x|\Sigma)$, $\in \mathcal{F}$ iff

(i) All moments and crossmoments of x of order $\le 4$ exist
and the covariance matrix $\Sigma$ is nonsingular. Let E(x) = 0
without loss of generality and let
$\operatorname{var}(x) = \Sigma = \Gamma \Lambda \Gamma'$, as previously.

(ii) The distribution of $w = \Sigma^{-1/2} x$ is spherically symmetric,
that is: for every orthogonal (p x p) matrix H, w and
Hw have the same distribution.

If the density exists, it can be expressed in the form

$$g(x' \Sigma^{-1} x)$$

for some function g. (See for instance Lord, 1954 and Kelker, 1970.)

The class $\mathcal{F}$ provides a simple yet quite large class of
multivariate models as an alternative to multivariate normal models. Note
that the standardized kurtosis $\kappa_4^{i}$ is the same for all marginal
distributions $(i = 1, \dots, p)$. By varying g one can generate both
long-tailed ($\kappa_4 > 0$) and short-tailed ($\kappa_4 < 0$) models; this
is particularly appropriate for the present study since the limiting
distributions depend only on the fourth cumulants of the parent
population. The class $\mathcal{F}$ includes in particular the p-variate
normal model $\Phi(x|\Sigma)$ and the $\epsilon$-contaminated normal models
with distribution function given by

$$\Phi_\epsilon(x|\Sigma, \sigma) = (1 - \epsilon)\, \Phi(x|\Sigma) + \epsilon\, H(x|\sigma^2 \Sigma),$$

where $0 < \epsilon < 1$, $\sigma$ is a positive scalar and
$H(x|\Sigma) \in \mathcal{F}$; it also includes the multivariate t
distribution.
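For the contaminated normal family the common kurtosis parameter has a closed form, which gives a quick way to relate $(\epsilon, \sigma)$ to the size of the departure from normality. The sketch below assumes that the contaminating component is itself normal and that $\sigma$ scales standard deviations (covariance $\sigma^2 \Sigma$); this reading is an assumption, chosen because it reproduces the values $3\kappa = 1.6$ and $3\kappa = 5.3$ quoted in Section 4 for $(\epsilon, \sigma) = (.3, 2)$ and $(.1, 3)$. The function name `contaminated_normal_kappa` is hypothetical.

```python
def contaminated_normal_kappa(eps, sigma):
    """kappa such that each marginal has standardized kurtosis kappa_4 = 3 * kappa.

    Model assumed: (1 - eps) * N(0, Sigma) + eps * N(0, sigma**2 * Sigma).
    """
    m2 = (1.0 - eps) + eps * sigma ** 2     # marginal variance, in units of the base variance
    m4 = (1.0 - eps) + eps * sigma ** 4     # marginal fourth moment divided by 3
    return m4 / m2 ** 2 - 1.0
```

Setting `eps = 0` or `sigma = 1` returns zero, the normal case.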

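It will be shown that considerable simplifications appear in the
limiting distributions of $W_q$ for any population in $\mathcal{F}$. These
simplifications arise from the fact that the standardized cumulants of the
principal components $y_i$ $(i = 1, \dots, p)$ then satisfy

$$(9) \qquad \kappa_4^{i} = 3 \kappa_{22}^{ij} = 3\kappa \quad (\forall i \ne j),$$

$$\kappa_{31}^{ij} = \kappa_{211}^{ijl} = \kappa_{1111}^{ijlm} = 0 \quad (\text{distinct indices } i, j, l, m),$$

that is, all other standardized fourth order cumulants vanish.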


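3.1. Asymptotic Distributions of $W_q$.

Proposition 4. For any population $F(x|\Sigma) \in \mathcal{F}$, under the
null hypothesis

$$H_0:\ \lambda_1 = \lambda_2 = \dots = \lambda_q = \lambda < \lambda_{q+1} < \dots < \lambda_p$$

the limiting distribution of $W_q/(\kappa+1)$ as $N \to +\infty$ is
chi-square on (q+2)(q-1)/2 degrees of freedom.

Proof. As for the general case, the limiting distribution under
$H_0$ of $W_q$ is that of

$$T = \sum_{\substack{i,j=1 \\ i<j}}^{q} z_{ij}^2 + \frac{1}{2} \sum_{i=1}^{q} (z_{ii} - \bar z)^2 .$$

Consider

$$u' = \Big( \frac{z_{11}}{\sqrt 2}, \dots, \frac{z_{qq}}{\sqrt 2},\ z_{21}, z_{31}, z_{32}, \dots, z_{q,(q-1)} \Big) \Big/ \sqrt{\kappa+1};$$

then

$$T = (\kappa+1)\, u' H u$$

where H is the block diagonal matrix

$$H = \begin{pmatrix} I_q - \frac{1}{q} e e' & 0 \\ 0 & I_{q(q-1)/2} \end{pmatrix} .$$

($I_q$ denotes the q x q identity matrix and $e' = (1, \dots, 1) \in R^q$.)
The limiting distribution of u is normal with mean 0 and covariance
matrix

$$V = \begin{pmatrix} I_q + \frac{\kappa}{2(\kappa+1)} e e' & 0 \\ 0 & I_{q(q-1)/2} \end{pmatrix},$$

as follows from (7) and (9). Then, by Lemma 2,

$$\lim_{N \to +\infty} \mathcal{L}(u' H u) = \chi^2(r)$$

since V is nonsingular and VH is idempotent of rank r = (q+2)(q-1)/2,
and Proposition 4 holds.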
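The idempotency condition of Lemma 2 used in this proof is easy to verify numerically. In the sketch below the matrix V is not copied from the paper: it is derived here from (7) and (9), which give $\operatorname{var} z_{ii} = 3\kappa + 2$, $\operatorname{cov}(z_{ii}, z_{jj}) = \kappa$ and $\operatorname{var} z_{ij} = \kappa + 1$ with all other covariances zero, so it should be read as an assumption. The function name `prop4_matrices` is hypothetical.

```python
import numpy as np

def prop4_matrices(q, kappa):
    """Build H and the limiting covariance V of u for a spherically symmetric parent."""
    m = q * (q - 1) // 2                      # number of off-diagonal terms z_ij, i > j
    e = np.ones((q, 1))
    H = np.zeros((q + m, q + m))
    H[:q, :q] = np.eye(q) - e @ e.T / q       # centering block for the diagonal terms
    H[q:, q:] = np.eye(m)
    V = np.zeros_like(H)
    # cov of z_ii / sqrt(2 (kappa + 1)): ((2 kappa + 2) I + kappa ee') / (2 (kappa + 1))
    V[:q, :q] = np.eye(q) + kappa / (2.0 * (kappa + 1.0)) * (e @ e.T)
    V[q:, q:] = np.eye(m)                     # var of z_ij / sqrt(kappa + 1) is 1
    return H, V

H, V = prop4_matrices(4, 0.5)
VH = V @ H
```

For q = 4 the product VH is idempotent with rank 9 = (q+2)(q-1)/2, whatever the value of $\kappa > -1$ supplied; the term in $ee'$ is annihilated by the centering block, which is why a single scalar correction $\kappa + 1$ suffices.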

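Proposition 5. (a) For any distribution $F \in \mathcal{F}$, under the
sequence of alternatives

$$H_{aN}:\ \lambda_i = \lambda \Big( 1 + \frac{a_i}{\sqrt{N}} \Big) \quad (\forall i = 1, \dots, q),$$

where $a_i \in R$ and $\lambda > 0$, the limiting distribution as
$N \to +\infty$ of $W_q/(\kappa+1)$ is noncentral chi-square on
(q+2)(q-1)/2 degrees of freedom and noncentrality parameter

$$f = \sum_{i=1}^{q} (a_i - \bar a)^2 / 4(\kappa+1), \quad \text{where } \bar a = \frac{1}{q} \sum_{i=1}^{q} a_i .$$

(b) For any distribution $F \in \mathcal{F}$, under the sequence of
alternatives

$$H'_{aN}:\ \lambda_i = \lambda \Big( 1 + \frac{a_i}{N} \Big) \quad (\forall i = 1, \dots, q),$$

the limiting distribution as $N \to +\infty$ of $W_q/(\kappa+1)$ is
chi-square on (q+2)(q-1)/2 degrees of freedom.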
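The noncentrality parameter of Proposition 5(a) depends on the alternative only through the spread of the $a_i$ about their mean and is deflated by $\kappa + 1$ for long-tailed parents. A minimal sketch follows; the function name `noncentrality` is hypothetical. Note that the paper follows the convention of Lemma 2, in which the noncentrality parameter is $\tfrac{1}{2}\mu' A \mu$; software that parametrizes the noncentral chi-square by $\mu' A \mu$ instead (as `scipy.stats.ncx2` does with its `nc` argument) would need the value 2f.

```python
import numpy as np

def noncentrality(a, kappa):
    """f = sum_i (a_i - a_bar)^2 / (4 (kappa + 1)), in the convention of Lemma 2."""
    a = np.asarray(a, dtype=float)
    return np.sum((a - a.mean()) ** 2) / (4.0 * (kappa + 1.0))
```

For example, `noncentrality([1, -1], 0.0)` is 0.5, and any constant vector a gives f = 0: a common rescaling of the q smallest roots is not a departure from sphericity.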

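Proof. As in the proof of Proposition 4, note that in the case (a)
the limiting distribution of $W_q$ is that of

$$\frac{1}{2} \sum_{i=1}^{q} (z_{ii} - \bar z + a_i - \bar a)^2 + \sum_{\substack{i,j=1 \\ i<j}}^{q} z_{ij}^2$$

and in the case (b) the same as under the null hypothesis. Then for (a)
consider

$$v' = \Big( \frac{z_{11} + a_1}{\sqrt 2}, \dots, \frac{z_{qq} + a_q}{\sqrt 2},\ z_{21}, z_{31}, z_{32}, \dots, z_{q,(q-1)} \Big) \Big/ \sqrt{\kappa+1};$$

the limiting distribution of v is normal with mean

$$\delta' = \Big( \frac{a_1}{\sqrt 2}, \dots, \frac{a_q}{\sqrt 2},\ 0, \dots, 0 \Big) \Big/ \sqrt{\kappa+1}$$

and covariance matrix V, where V is the limiting covariance matrix of u
in the proof of Proposition 4. Then, by Lemmas 1 and 2,

$$\lim_{N \to +\infty} \mathcal{L}\Big( \frac{W_q}{\kappa+1} \Big) = \lim_{N \to +\infty} \mathcal{L}(v' H v) = \chi'^2_f(r),$$

with $f = \tfrac{1}{2} \delta' H \delta = \sum_{i=1}^{q} (a_i - \bar a)^2 / 4(\kappa+1)$.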


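Proposition 5 generalizes two results given by Nagao (1970) for the
limiting distribution of $W_p$ when q = p and the population is
multivariate normal. By expanding the characteristic function of -2 log
of the likelihood ratio, Nagao showed that under the sequence of
alternatives

$$\Sigma = \lambda ( I + N^{-\delta}\, \Theta )$$

(where $\Theta$ is a symmetric positive matrix), the limiting distribution
of $W_p$ is noncentral chi-square $\chi'^2_f(r)$, with noncentrality
parameter

$$f = \frac{1}{4} \Big[ \operatorname{tr} \Theta^2 - \frac{1}{p} (\operatorname{tr} \Theta)^2 \Big]$$

when $\delta = \tfrac{1}{2}$, and central chi-square on r degrees of freedom
when $\delta = 1$. For spherically symmetric models under the same sequence
of alternatives, the limiting distribution of $W_p/(\kappa+1)$ is
noncentral chi-square $\chi'^2_f(r)$, with

$$f = \Big[ \operatorname{tr} \Theta^2 - \frac{1}{p} (\operatorname{tr} \Theta)^2 \Big] \Big/ 4(\kappa+1)$$

when $\delta = \tfrac{1}{2}$, and central chi-square on r degrees of freedom
when $\delta = 1$.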


3.2. Robust Test for Sphericity. The previous asymptotic
distributions suggest a reasonable method for testing for sphericity in the
class $\mathcal{F}$ of spherically symmetric models.

Proposition 6. For any population in the class $\mathcal{F}$ defined by
(8), the test statistic

$$W_q^* = \frac{W_q}{\hat\kappa + 1},$$

where $\hat\kappa$ is a consistent estimator of $\kappa$, is asymptotically
chi-square on r = (q+2)(q-1)/2 degrees of freedom under the null hypothesis
$H_0$ and the sequence of alternatives $\{H'_{aN}\}$, and noncentral
chi-square under the sequence of alternatives $\{H_{aN}\}$.

Proof. The results follow from Lemma 1 and Propositions 4 and 5.

The asymptotic null distribution of $W_q^*$ is the same for all
populations in $\mathcal{F}$ and therefore $W_q^*$ can be used to test the
hypothesis of sphericity.

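Estimation of $\kappa$. Under the null hypothesis $\lambda_i = \lambda$
$(i = 1, \dots, q)$, one notes that

$$E(R^2) = E\Big( \sum_{i=1}^{q} y_i^2 \Big) = q \lambda,$$

$$E(R^4) = E\Big( \sum_{i=1}^{q} y_i^2 \Big)^2 = q(q+2)(\kappa+1)\, \lambda^2,$$

where it is recalled that the $y_i$ are assumed to have no correlation.
Thus the nuisance parameter $\kappa$ satisfies

$$(\kappa + 1) = \frac{q\, E(R^4)}{(q+2)\, (E(R^2))^2}$$

and a consistent estimator of $\kappa$ is

$$\hat\kappa = \frac{q\, m_4}{(q+2)\, m_2^2} - 1,$$

where

$$m_2 = \frac{1}{N} \sum_{\alpha=1}^{N} \sum_{i=1}^{q} \hat y_{i\alpha}^2, \qquad m_4 = \frac{1}{N} \sum_{\alpha=1}^{N} \Big( \sum_{i=1}^{q} \hat y_{i\alpha}^2 \Big)^2,$$

and $\hat y_{i\alpha}$ is the i-th sample principal component of the
$\alpha$-th observation. The sample components are used because, in
practice, the population covariance matrix is not necessarily diagonal.
Note that if p = q

$$\sum_{i=1}^{p} \hat y_{i\alpha}^2 = \sum_{i=1}^{p} x_{i\alpha}^2,$$

and it is not necessary to use the $\hat y_{i\alpha}$ to estimate $\kappa$.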
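The moment estimator of $\kappa$ and the corrected statistic can be computed together from one eigendecomposition. The following is a minimal sketch under stated assumptions, not the author's program: the data are centered at the sample mean, S uses divisor N, the q components with the smallest sample variances are used, and the function name `corrected_sphericity_statistic` is hypothetical.

```python
import numpy as np

def corrected_sphericity_statistic(X, q):
    """Return (W_q*, W_q, kappa_hat) with W_q* = W_q / (kappa_hat + 1)."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / N
    roots, vecs = np.linalg.eigh(S)             # ascending roots, matching columns
    small = roots[:q]
    W = -N * (np.sum(np.log(small)) - q * np.log(np.mean(small)))
    Y = Xc @ vecs[:, :q]                        # sample principal components y_hat
    R2 = np.sum(Y ** 2, axis=1)                 # sum_i y_hat_i^2 for each observation
    m2 = np.mean(R2)
    m4 = np.mean(R2 ** 2)
    kappa_hat = q * m4 / ((q + 2) * m2 ** 2) - 1.0
    return W / (kappa_hat + 1.0), W, kappa_hat
```

Since $m_4 \ge m_2^2$, the divisor $\hat\kappa + 1$ is at least q/(q+2) and therefore always positive; for normal data $\hat\kappa$ is close to zero and the two statistics nearly coincide.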

The following points are worth raising about the estimation of the
kurtosis of a distribution in general, and $\kappa$ in particular, for
long-tailed populations.

(i) The proposed estimate $\hat\kappa$ is not "robust" in the sense that
its influence curve (see Hampel, 1973) is not bounded, and therefore
one can expect sensitivity to large outliers. However $\kappa$ is actually
a standardized kurtosis and a large observation will inflate both
numerator ($m_4$) and denominator ($m_2^2$). Also $\hat\kappa$ is, in this
case, computed by summing over q dimensions, and turned out to be
considerably more stable than the estimate of the kurtosis of a single
variable.

(ii) Numerous attempts have been made by the author to produce a
robust estimate of kurtosis, such as winsorizing or weighting down large
observations, and several estimates were empirically studied using Monte
Carlo methods. The robustified estimates exhibited a considerable
bias towards zero, making the correction term $(1 + \hat\kappa)$
ineffectual.

(iii) Finally, further effort in the estimation of $\kappa$ was
abandoned when the empirical results showed that the performances of
$W_q/(\kappa+1)$, where the true value for $\kappa$ was substituted for each
population (corresponding to a hypothetical perfect estimate), were not
superior to the performances of $W_q^*$. It seems that the presence of a
large outlier increases $W_q$ and $\hat\kappa$ simultaneously, but does not
affect $W_q^*$ as severely.

3.3. Generalization to a Wider Class of Distributions. It should
be emphasized that the previous results hold for an even wider class of
populations than the class $\mathcal{F}$. It was remarked in Section 2.1
that under the null hypothesis (or under a sequence of alternatives close
to the null hypothesis) the limiting distribution of $W_q$ depends on the
distribution of $y_i$ $(i = 1, \dots, q)$ only. More precisely, the proofs
for Propositions 4 and 5 depend only on the assumption that the asymptotic
covariance matrix of $z_{ii}$ $(i = 1, \dots, q)$ has the structure of an
intraclass correlation matrix, that is, of the form

$$(a I_q + b\, e e')$$

and that the $z_{ij}$ $(i \ne j)$ are asymptotically uncorrelated with one
another and with the $z_{ii}$, with asymptotic variance

$$\operatorname{var} z_{ij} = a/2 \quad (i, j = 1, \dots, q;\ i \ne j)$$

or, equivalently, on the assumption that the standardized cumulants of
the population principal components satisfy (9). In summary:

Proposition 7. Let x be a (p x 1) vector for which all moments
of order 4 exist. Let $\Sigma = \Gamma \Lambda \Gamma'$ denote the
covariance matrix of x. A sufficient condition for the limiting null
distribution of $W_q/(\kappa+1)$ to be chi-square on r degrees of freedom
is that

$$\operatorname{var} y_i = \lambda \quad (i = 1, \dots, q),$$

where $y = \Gamma' x$ is the (p x 1) vector of the (ordered) principal
components, and that the standardized cumulants of y satisfy

$$\kappa_4^{i} = 3 \kappa_{22}^{ij} = 3\kappa \quad (i, j = 1, \dots, q;\ i \ne j),$$

$$\kappa^{ijkm} = 0 \quad \text{for all other index combinations } (i, j, k, m \le q).$$

In particular, it is sufficient that the joint distribution of
$(y_1, \dots, y_q)$ is spherically symmetric.

4. Monte Carlo Experiment. A Monte Carlo experiment has been
conducted to study the sampling distributions of the test statistics $W_q$
and $W_q^*$ for various populations and sample sizes N = 50, 100, 200 under
the null and alternative hypotheses. The different populations considered
were

(i) a multivariate normal population,

(ii) a short-tailed population which is a mixture of two normal
populations with different means,

and several contaminated normal populations with distribution function

$$\Phi_\epsilon(x|\Sigma, \sigma) = (1 - \epsilon)\, \Phi(x|\Sigma) + \epsilon\, \Phi(x|\sigma^2 \Sigma)$$

(where $\Phi(x|\Sigma)$ denotes the distribution function of a multivariate
normal vector x with covariance matrix $\Sigma$), with parameters:

(iii) $\epsilon = .3$, $\sigma = 2$,

(iv) $\epsilon = .1$, $\sigma = 3$.

The dimensions and population roots considered were

(i) p = q = 2 and $\lambda_1 = \lambda_2 = 1$,

(ii) p = q = 6 and $\lambda_1 = \lambda_2 = \dots = \lambda_6 = 1$,

(iii) p = 6, q = 4 [root values illegible in the source].

For each sample size 200 samples were drawn for each population and the
corresponding values for both test statistics $W_q$ and $W_q^*$ were
computed. The sampling distributions of $W_q$ and $W_q^*$ have then been
compared to the limiting distributions given by the asymptotic theory: the
central chi-square on r degrees of freedom under the null hypothesis, the
normal approximation as well as the central and noncentral chi-square under
the alternative.
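A scaled-down version of one cell of this experiment can be sketched as follows. This is an illustration, not a reproduction of the original program: it assumes p = q = 6, $\Sigma = I$, a contaminating component that is normal with covariance $\sigma^2 I$, NumPy's default random generator with an arbitrary seed, and the tabulated upper 5% point 31.410 of the chi-square on 20 degrees of freedom. All function names are hypothetical.

```python
import numpy as np

CHI2_95_DF20 = 31.410   # upper 5% point of chi-square on 20 d.f. (standard tables)

def draw_contaminated(rng, N, p, eps, sigma):
    """N draws from (1 - eps) * N(0, I_p) + eps * N(0, sigma^2 I_p)."""
    scale = np.where(rng.random(N) < eps, sigma, 1.0)   # per-observation std. dev.
    return rng.standard_normal((N, p)) * scale[:, None]

def w_and_wstar(X):
    """W_q and W_q* for q = p (all roots tested for equality)."""
    N, q = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / N
    roots = np.linalg.eigvalsh(S)
    W = -N * (np.sum(np.log(roots)) - q * np.log(np.mean(roots)))
    R2 = np.sum(Xc ** 2, axis=1)        # for p = q no rotation is needed
    kappa_hat = q * np.mean(R2 ** 2) / ((q + 2) * np.mean(R2) ** 2) - 1.0
    return W, W / (kappa_hat + 1.0)

def rejection_counts(eps, sigma, N=200, p=6, reps=200, seed=0):
    """Number of samples (out of reps) in which W and W* exceed the 5% point."""
    rng = np.random.default_rng(seed)
    n_w = n_wstar = 0
    for _ in range(reps):
        W, Wstar = w_and_wstar(draw_contaminated(rng, N, p, eps, sigma))
        n_w += int(W > CHI2_95_DF20)
        n_wstar += int(Wstar > CHI2_95_DF20)
    return n_w, n_wstar

n_w, n_wstar = rejection_counts(eps=0.1, sigma=3.0)
```

Exact counts depend on the seed, so none are quoted here; the qualitative pattern to look for is a count for $W_q$ far above the nominal 10 out of 200 and a count for $W_q^*$ close to it.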

The simulations showed that the asymptotic chi-square approximation to
the distributions of $W_q/(\kappa+1)$ and $W_q^*$ is good for N = 100, 200,
but somewhat less accurate for N = 50; for normal populations, however,
the asymptotic approximation of the distribution of $W_q$ by a chi-square
is accurate for sample sizes as small as N = 50. Tables 1 and 2 present
some numerical results of the simulations.

It is clearly demonstrated in Tables 1 and 2 that the Lawley-Bartlett
criterion $W_q$ cannot be used for testing the hypothesis of sphericity
when the parent population is nonnormal, even for moderate departures from
normality such as the contaminated normal $\epsilon = .3$, $\sigma = 2$
(the standardized kurtosis is then $3\kappa = 1.6$). As an example, for
N = 200, p = q = 6, the observed significance level of the test based
on $W_q$ is 50%, for a nominal level of 5%. On the other hand, the test
based on the proposed statistic $W_q^*$ has an observed significance level
of 7.5%, not significantly different from the nominal level of 5%. For
a normal population, the test based on $W_q$ is the likelihood ratio test,
but the test criteria $W_q$ and $W_q^*$ are then almost identical, and
therefore $W_q^*$ can be used without loss of efficiency at the normal
model.

For short-tailed models, as expected, the results of the simulations
were not as spectacular. Since the kurtosis of any population must
satisfy $\kappa_4 \ge -2$, no short-tailed model can produce the same
extreme behavior for $W_q$ as occurs for long-tailed models. For moderately
short-tailed models, the sampling distribution of $W_q$ is well
approximated by a chi-square.
TABLE 1. Number of times (out of 200) $W_q$ and $W_q^*$ exceed $\chi^2_\alpha(20)$
p = q = 6, N = 200

Model                               Test    α = .25    α = .05    α = .01

Normal                              W          53         10          4
                                    W*         52         10          5

Cont. Normal ε = .3, σ = 2          W         146         99         37
(3κ = 1.6)                          W*         52         15          7

Cont. Normal ε = .1, σ = 3          W         198        188        160
(3κ = 5.3)                          W*         53         13          6

exp. number (a)                             50 ± 12     10 ± 6      2 ± 3


TABLE 2. Number of times (out of 200) $W_q$ and $W_q^*$ exceed $\chi^2_\alpha(9)$
p = 6, q = 4, N = 200

Model                               Test    α = .25    α = .05    α = .01

Normal                              W          55         11          1
                                    W*         49          9          1

Cont. Normal ε = .3, σ = 2          W         116         51         17
(3κ = 1.6)                          W*         51         12          2

Cont. Normal ε = .1, σ = 3          W         183        151        117
(3κ = 5.3)                          W*         55         10          3

exp. number (a)                             50 ± 12     10 ± 6      2 ± 3

(a) expected number: $200\alpha \pm 2\sqrt{200\,\alpha(1-\alpha)}$
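The "expected number" row of the tables is the binomial mean of the count of exceedances plus or minus two binomial standard deviations. The short sketch below recomputes it; the function name `expected_exceedances` is hypothetical.

```python
import math

def expected_exceedances(alpha, reps=200):
    """Mean and two-standard-deviation half-width of a Binomial(reps, alpha) count."""
    mean = reps * alpha
    half_width = 2.0 * math.sqrt(reps * alpha * (1.0 - alpha))
    return mean, half_width
```

For $\alpha$ = .25, .05 and .01 with 200 replications this gives approximately 50 ± 12.2, 10 ± 6.2 and 2 ± 2.8, in agreement with the rounded entries 50 ± 12, 10 ± 6 and 2 ± 3.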